Abstract


This paper explores the question of how spectral profiles are represented in the
auditory system. Using profile analysis methods, listeners’ sensitivities to changes in
spectral peak shapes and ripple phases were measured. Peak shapes were uniquely
described in terms of two parameters: a symmetry factor (SF), which roughly measures
the local evenness or oddness of a peak, and a bandwidth factor (BWF), which
reflects the tuning or sharpness of a peak. Thresholds to changes in these parameters
(defined as δSF and δBWF/BWF) were measured together with the effects of several
manipulations such as using different peak levels, varying spectral component densities,
and randomizing the frequencies of the peaks. The basic result that emerges is that
δSF and δBWF/BWF thresholds are largely constant regardless of the standard’s peak
shape. The only exception occurs for the narrowest peaks (smallest BWF’s) where δSF
thresholds rise. A fundamental conclusion arising from these data is that peak profiles
are represented along two sensitive and largely independent axes: peak bandwidth and
symmetry factors. More generally, it is conjectured that for an arbitrary spectral profile
these two axes correspond to the magnitude and phase of a Fourier transformation
of the profile. In this light, the last set of experiments measured listeners’ sensitivity to
ripple phase changes in sinusoidal ripple stimuli. The thresholds obtained are similar
in value and trends to δSF thresholds.

INTRODUCTION 

The  shape  of  the  acoustic  spectrum  is  a  fundamental  cue  in  the  perception  and 
recognition  of  complex  sounds.  It  is  largely  uncertain,  however,  how  this  spectrum 
is  represented  in  the  auditory  system,  and  what  specific  features  are  extracted  and 
emphasized  by  such  a  representation.  This  issue  was  explored  in  a  recent  series  of 
physiological mappings in the primary auditory cortex, AI [Shamma et al., 1993]. The
findings  from  these  experiments  revealed  that  the  responses  along  the  isofrequency 
planes  of  AI  potentially  encode  an  explicit  measure  of  the  locally  averaged  gradient 
of  the  acoustic  spectrum.  Several  other  response  features  have  also  been  mapped  in 
the AI, including FM directional sensitivity [Shamma et al., 1993] and response area
bandwidth and tuning [Schreiner and Mendelson, 1990].

The existence of such ordered maps has certain perceptual implications. For instance,
it is likely that the perception of a spectral peak (such as a vowel formant)
would be significantly affected by its symmetry and bandwidth. This, in turn, suggests
that in characterizing the perceptual quality of an arbitrary spectral pattern, one
has to take into account not only its peaks’ frequencies and levels, but also the local
gradients around, and tuning of, the peaks. To explore this and other possibilities
further, psychoacoustical experiments were carried out to test directly the sensitivity
of human subjects to changes in spectral peak shapes. Specifically, our aim was
to measure the sensitivity to symmetry and bandwidth changes in single spectral peaks
under a variety of conditions, such as different spectral compositions, peak levels, and
peak frequency randomization.

The experiments reported here are similar in methodology to previously reported
profile analysis experiments ([Bernstein and Green, 1987; Bernstein, Richards and
Green, 1987; Green, Mason and Kidd, 1984; Kidd, Mason and Green, 1984]). They also
share the overall goals of the phonetic distance measure experiments described
in [Assmann and Summerfield, 1989] and [Klatt, 1982]. Our experiments, however,
differ from previously published profile analysis experiments in the choice of a non-flat
standard (a spectral peak). They also differ in the nature of the manipulations applied
to it, i.e., changes in bandwidth and symmetry, rather than amplitude.

These  two  deformations  of  the  peak  profile  are  somewhat  more  general  than  would 
appear  at  first  glance.  Specifically,  if  one  imagines  the  peak  profile  drawn  on  a  flat 
stretchable  square  sheet,  then  changing  the  bandwidth  is  equivalent  to  dilating  the 
profile  or  pulling  apart  the  opposite  sides  of  the  sheet.  Changing  the  symmetry  is 
approximately  analogous  to  pulling  apart  opposite  corners  of  the  sheet,  thus  causing  the 
profile to appear skewed or tilted. Clearly, such deformations can be applied to any
arbitrary profile drawn on the sheet, and the corresponding thresholds measured and
compared. Moreover, as we shall elaborate in Sec. V, these manipulations of the
profile  can  be  precisely  defined  in  another  domain  -  the  Fourier  transform  domain  of  the 
profile.  This  view,  combined  with  the  physiological  evidence  and  the  psychoacoustical 
data  presented  here  regarding  subjects’  sensitivities  to  these  manipulations,  suggests 
that  it  is  the  transform,  and  not  the  profile  itself,  that  is  represented  in  the  central 
auditory  system. 

In this part of the paper, threshold measurements for all of the above-mentioned
manipulations are presented and critically interpreted in the context of two profile
analysis models [Bernstein and Green, 1987; Durlach, Braida and Ito, 1986]. In Part II
[Vranic-Sowers and Shamma, 1993], these and other data from several profile analysis
experiments [Bernstein and Green, 1987; Green, 1986; Hillier, 1991] are integrated
within a “ripple analysis model” based on the idea that the auditory system internally
represents a spectral profile in terms of its Fourier transform.

In the following section, the acoustic stimuli and general experimental procedures
are described in detail. Then, we present the results of subjects’ sensitivities to changes
in the symmetry (Sec. II) and bandwidth (Sec. III) of peak profiles, for different peak
shapes, levels, and spectral densities. Two control experiments are described in Sec.
IV, in which the relevance of pitch cues and peak energy changes in the above
discrimination tasks is evaluated. In Sec. V, the results are briefly discussed within a
general theoretical framework, and further experiments with rippled spectra are
presented (Sec. VI). We end with a general discussion of the results in relation to other
profile analysis experiments.
I. GENERAL PROCEDURE

A.  Methods 

Sounds were generated at a 25 kHz sampling rate via a Data Acquisition/Control
Unit (HP3852A) and two 16 bit 2-Channel Arbitrary Waveform DACs (HP44726A).
They were low-pass filtered at 10 kHz and passed through an equalizer (IEQ One/Third
Octave Intelligent Programmable) for level adjustment. Before presentation to listeners,
sounds were gated for a 110 ms duration, including 10 ms rise and decay ramps.
Sounds were delivered inside an acoustic chamber through a speaker (ADS L470), i.e.,
without headphones.

A two-alternative, two-interval forced choice adaptive procedure was used to estimate
the thresholds. Each trial consisted of two 110 ms long observation intervals
separated by a 500 ms pause. After the listener’s response, brief visual feedback was
provided and a new trial started, until all 50 trials that comprise one block were presented.

The discrimination task for spectral peak stimuli was to distinguish between the
standard, which did not change over a block of trials, and the signal, which resembled
the standard except for an adaptive change in spectral peak shape in each trial. The
step  size  was  defined  in  terms  of  changes  in  the  right  slope  of  the  peak  in  decibels,  and  it 
differed  across  the  testing  conditions.  For  spectral  sinusoidal  stimuli,  the  discrimination 
task  and  stimulus  parameters  are  described  in  Sec.  VI. 

On  the  first  trial  the  signal  was  three  step  sizes  away  from  the  standard.  On  each 
subsequent  trial  the  signal  was  changed  according  to  the  “two-down,  one-up”  procedure 
in order to estimate the level that produces 70.7% correct answers ([Levitt, 1971]). The
step  size  was  halved  after  3  reversals  and  the  threshold  was  estimated  as  the  average 
of  the  signal  across  the  last  even  number  of  reversals  excluding  the  first  three.  Signal 
and  standard  occurred  with  equal  a  priori  probability  in  one  of  the  two  intervals. 
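The adaptive rule described above can be sketched in a few lines. This is only an illustrative sketch, not the software used in the experiments: the function name `run_staircase`, the `respond` callback that stands in for a listener, and the clamping of the tracked value at zero are assumptions made for the example. The two-down, one-up rule converges on the 70.7% point because two consecutive correct responses are required to make the task harder (p² = 0.5 gives p ≈ 0.707).

```python
def run_staircase(respond, step, n_trials=50):
    """Two-down, one-up adaptive track (hypothetical helper).

    respond(level) -> True if the simulated listener answers correctly.
    Returns (threshold_estimate, list_of_reversal_levels).
    """
    level = 3 * step      # first trial: three step sizes away from the standard
    n_correct = 0         # consecutive correct answers at the current level
    direction = 0         # -1 last move was down, +1 up, 0 no move yet
    reversals = []
    for _ in range(n_trials):
        if respond(level):
            n_correct += 1
            move = -1 if n_correct == 2 else 0   # two correct: make it harder
        else:
            move = +1                            # one wrong: make it easier
        if move != 0:
            n_correct = 0
            if direction != 0 and move != direction:
                reversals.append(level)          # track changed direction
                if len(reversals) == 3:
                    step /= 2                    # halve step after 3 reversals
            direction = move
            level = max(level + move * step, 0.0)  # assumption: never below 0
    used = reversals[3:]                         # exclude the first three
    if len(used) % 2:                            # keep the last even number
        used = used[1:]
    threshold = sum(used) / len(used) if used else None
    return threshold, reversals
```

With a deterministic stand-in listener such as `lambda level: level >= 1.0`, the track oscillates around the cutoff once the step has been halved; a real listener would of course respond probabilistically.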

The  overall  presentation  level  was  randomized  across  trials  and  within  a  trial  over 
a  20  dB  range  in  1  dB  resolution,  in  order  to  ensure  that  listeners  base  their  judgement 
on  a  change  in  spectral  shape  rather  than  on  absolute  level  change  in  a  particular 
frequency  band  ([Green,  1988]). 
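The logic of the level randomization can be illustrated with a short sketch: scaling every component by the same random gain changes the absolute level in each frequency band but leaves the level differences between components, i.e., the spectral shape, untouched. The helper names and the choice of a range centred on 0 dB are assumptions for this example; the text specifies only a 20 dB range with 1 dB resolution.

```python
import math
import random


def rove_gain_db(rng, span_db=20):
    """Random overall level offset in whole dB (assumed centred on 0 dB)."""
    return rng.randint(-span_db // 2, span_db // 2)


def apply_rove(amplitudes, gain_db):
    """Scale all component amplitudes by the same gain."""
    g = 10 ** (gain_db / 20)
    return [a * g for a in amplitudes]


def profile_db(amplitudes):
    """Component levels in dB re the first component: the spectral shape."""
    ref = amplitudes[0]
    return [20 * math.log10(a / ref) for a in amplitudes]


rng = random.Random(0)
standard = [1.0, 1.0, 6.62, 1.0, 1.0]            # toy 5-component profile
interval_1 = apply_rove(standard, rove_gain_db(rng))
interval_2 = apply_rove(standard, rove_gain_db(rng))
# The two intervals differ in absolute level but share the same profile_db().
```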

The  results  reported  are  based  on  data  from  two  to  five  normal  hearing  subjects, 
depending  on  the  particular  test.  Subjects  were  trained  for  about  a  week  (four  days  a 
week,  60  -  90  minutes  per  day),  before  the  actual  recording  took  place. 

B.  Spectral  peak  stimulus  parameters 

Both the standard and the signal multicomponent peak profiles consist of two portions,
the base and the peak. The base components were all equal in amplitude and
they were added in phase to peak components of different symmetries and bandwidths
to form peak profiles as shown in Fig. 1. The peak profile was defined against a
logarithmic frequency axis ($\omega$) in octaves, $\omega = \log_2(f/f_0)$, where $f$ is the
frequency in kHz and $f_0$ is the frequency of the largest peak component. The peak
profile is defined in terms of the following parameters (Fig. 2(a)):
Figure 1: (a) Complex waveform consists of a flat base and a peak added to it. Peak
takes different symmetries (b) and bandwidths (c). [Panel titles: standard, symmetry,
bandwidth.]
• $\omega_0$ is the location of the peak’s maximum. Since the peak is always located at 1
kHz, $\omega_0 = 0$.

• $S$ is the slope of the profile near the peak’s maximum (in dB/octave). For $\omega < \omega_0$,
$S = L$ (the left slope), and for $\omega > \omega_0$, $S = R$ (the right slope).

• $b(\omega) = b$ is the flat base of the peak profile.

• $a(\omega) = a_{max} \cdot 10^{\frac{S}{20}(\omega - \omega_0)}$ is the amplitude of the peak portion of the profile. $a_{max}$
is the maximum amplitude of the peak profile (at $\omega = \omega_0$). It is also defined in
dB as $A_{max} = 20 \log_{10}(a_{max}/b)$.


Figure 2: (a) Peak profile plotted on a linear (top) and logarithmic (bottom) amplitude
scale. Peak level ($A_{max}$) is 15 dB, BWF = 0.1, and SF = 0. [Axes: frequency in kHz;
$\omega$ in octaves, from -2 to 2.]
Therefore, the overall peak profile (on a linear scale) is given by:

$$p(\omega) = b(\omega) + a(\omega) = b + a_{max}\, 10^{\frac{S}{20}(\omega - \omega_0)},$$

and on the dB scale:

$$P_{dB}(\omega) = 20 \log_{10}\left(b + a_{max}\, 10^{\frac{S}{20}(\omega - \omega_0)}\right) = 20 \log_{10}\left(b \left(1 + 10^{\frac{A_{max}}{20} + \frac{S}{20}(\omega - \omega_0)}\right)\right).$$

For example, the peak in Fig. 2(a) (plotted on linear and dB scales) is 15 dB in level
($A_{max}$) with slopes $L = 60$ dB/octave and $R = -60$ dB/octave around the peak. Note
that around $\omega_0$, the peak profile can be approximated by:

$$P_{dB}(\omega) \approx 20 \log_{10}\left(b \cdot 10^{\frac{A_{max}}{20} + \frac{S}{20}(\omega - \omega_0)}\right) = 20 \log_{10} b + A_{max} + S(\omega - \omega_0),$$

i.e., the peak has approximately a triangular profile as shown in Fig. 2(a).

From the above definitions, the amplitude of each component $p_i$ in the stimulus can
be computed from:

$$p_i = b + a_{max}\, 10^{\,l\,(i - i_0)}, \quad \text{for } i < i_0,$$

and

$$p_i = b + a_{max}\, 10^{\,r\,(i - i_0)}, \quad \text{for } i > i_0,$$

where $i$ is the component index, $i_0$ is the index of the highest component located at
the peak’s maximum, $l = (L/20) \cdot (M/N)$, $r = (R/20) \cdot (M/N)$, $M$ is the frequency
range of the spectrum in octaves, and $N$ is the (odd) number of components. For our
centered peaks $i_0 = (N + 1)/2$.
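As an illustration of the component-amplitude formulas, the following sketch evaluates the amplitude of every component for a given pair of slopes. It assumes the exponents $l(i - i_0)$ and $r(i - i_0)$, consistent with the profile $10^{(S/20)(\omega - \omega_0)}$ defined earlier, a unit base amplitude, and a 1-based component index; the function name and default arguments are hypothetical.

```python
def component_amplitudes(L, R, a_max, b=1.0, N=41, M=4.64):
    """Linear amplitudes p_i of the N components of a peak profile.

    L, R  : left and right slopes in dB/octave (L > 0, R < 0)
    a_max : peak amplitude above the base (linear units)
    b     : base amplitude (assumed 1.0 here)
    """
    l = (L / 20) * (M / N)
    r = (R / 20) * (M / N)
    i0 = (N + 1) // 2                     # index of the centered peak
    amps = []
    for i in range(1, N + 1):             # 1-based component index
        slope = l if i < i0 else r        # at i == i0 the exponent is zero
        amps.append(b + a_max * 10 ** (slope * (i - i0)))
    return amps


# Symmetric 15 dB peak with 60 dB/octave slopes, as in Fig. 2(a).
a_max = 10 ** (15 / 20)                   # A_max = 20 log10(a_max / b), b = 1
amps = component_amplitudes(60, -60, a_max)
```

For a symmetric peak the amplitudes fall off identically on both sides of the central component, which carries the amplitude `b + a_max`.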

In order to vary the shape of the peaks, the peak profile was parametrized uniquely
in terms of a symmetry factor (SF) and a bandwidth factor (BWF). These parameters
reflect the difference and the average, respectively, of the slopes around the peak. They
are defined as: (1) SF $= (L + R)/(L - R)$; (2) BWF $= 3\,(1/L - 1/R)$ octave. Thus,
the peak in Fig. 2(a) has SF = 0 and BWF = 0.1 octave. Peaks with various other
SF’s and BWF’s are shown in Fig. 2(b) covering the full range of profiles used in our
experiments. Conversely, given any SF and BWF, the slopes around the peak can
be computed as: $R = -6/(\mathrm{BWF}\,(1 + \mathrm{SF}))$ dB/octave, and $L = 6/(\mathrm{BWF}\,(1 - \mathrm{SF}))$
dB/octave. Note that BWF is not strictly the bandwidth of the peak, but rather is
analogous to the inverse of the Q-factor of the peak. A third parameter, the peak
level ($A_{max}$), is also required to define the peak completely with respect to the baseline.
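The two parametrizations can be checked against each other numerically. The sketch below implements the definitions of SF and BWF and their inverses exactly as stated in the text; only the function names are invented for the example.

```python
def shape_from_slopes(L, R):
    """Return (SF, BWF) from left/right slopes in dB/octave (L > 0, R < 0)."""
    sf = (L + R) / (L - R)
    bwf = 3 * (1 / L - 1 / R)
    return sf, bwf


def slopes_from_shape(sf, bwf):
    """Return (L, R) in dB/octave from a symmetry and a bandwidth factor."""
    R = -6 / (bwf * (1 + sf))
    L = 6 / (bwf * (1 - sf))
    return L, R


# The peak of Fig. 2(a): L = 60, R = -60 dB/octave gives SF = 0, BWF = 0.1.
sf, bwf = shape_from_slopes(60, -60)

# Round trip for an asymmetric peak (SF = 0.4, BWF = 0.2).
L, R = slopes_from_shape(0.4, 0.2)        # L = 50, R is about -21.4 dB/octave
```

A positive SF makes the right slope shallower than the left one, which is the sense in which the peak is tilted towards higher frequencies.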

To  make  the  spectral  peaks  asymmetric,  they  were  always  tilted  towards  higher 
frequencies  (or  to  the  right).  This,  together  with  choosing  the  peak  frequency  at  1  kHz 
and  limiting  the  range  of  BWF  values  under  0.4,  ensured  that  the  spectral  peaks  were 
located  above  500  Hz  where  the  cochlear  frequency  axis  is  assumed  largely  logarithmic. 
This  is  an  important  consideration  since  the  peak  shapes  used  were  explicitly  defined 
in terms of spectral slopes along such an axis. The range of SF and BWF values tested
also corresponds to those that might be computed from the spectral envelope of speech
sounds, as shown in Fig. 2(c).

Figure 2: (b) Envelopes of various peak profiles plotted on a linear amplitude axis.
Columns share the same BWF’s (BWF = 0.1, 0.2, 0.4), and rows share the same SF’s.
Corresponding left and right slope values (in dB/octave) are shown for each case.

Figure 2: (c) SF’s and BWF’s for the spectral peaks of a naturally spoken vowel “aw”.

In all experimental conditions, standard and signal consisted of N = 11, 21, or 41
zero phase spectral components equally spaced on a logarithmic scale between 0.2 and
5 kHz ($\omega$ in the range ±2.32 octaves), i.e., M = 4.64 octaves, with the peak always
centered at 1 kHz ($\omega = 0$ octaves). The waveform was turned on 10 ms following the
onset in order to suppress the large amplitudes due to zero phases. No other phase
conditions were tested since numerous previous experiments have shown that phase
effects on signal detection are minimal ([Bernstein, Richards and Green, 1987; Green
and Mason, 1985]).
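The component layout can be reproduced with a few lines of arithmetic. The sketch below assumes that the outermost components sit exactly at 0.2 and 5 kHz; the function names are hypothetical.

```python
import math


def component_frequencies(N=41, f_lo=200.0, f_hi=5000.0):
    """N frequencies (Hz) equally spaced on a logarithmic axis."""
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (k / (N - 1)) for k in range(N)]


def octaves_re(f, f0=1000.0):
    """Position on the omega axis: octaves relative to the 1 kHz peak."""
    return math.log2(f / f0)


freqs = component_frequencies(41)
center = freqs[len(freqs) // 2]             # middle component, at 1 kHz
span = octaves_re(freqs[-1]) - octaves_re(freqs[0])   # about 4.64 octaves
```

Because 0.2 kHz and 5 kHz are each a factor of 5 away from 1 kHz, the odd number of components guarantees that one of them falls exactly on the peak frequency.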

C.  Spectral  peak  threshold  measures 

Threshold  measures  reported  here  were  derived  from  the  threshold  estimate  of  the 
signal  given  in  terms  of  the  right  slope.  This,  together  with  the  paradigm  conditions 
(a  constant  SF  or  BWF)  defines  uniquely  the  corresponding  left  slope,  and  therefore 
the  SF  and  BWF  of  the  peak  at  threshold. 

Two types of measures were defined and computed: (1) The first is in terms of the
amount of change in SF or BWF needed for detection, i.e., δSF or δBWF. In the case of
BWF change tests, thresholds are normalized by the peak’s BWF (i.e., δBWF/BWF).
(2) The second measure is the root-mean-square of the change in peak energy needed
for detection (see Appendix I). It is referred to as the rms-threshold.
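The conversion from a threshold expressed as a right slope to the first type of measure follows directly from the definitions of SF and BWF. The sketch below is an assumed reading of that conversion, not code from the study: holding BWF fixed gives $1/L = \mathrm{BWF}/3 + 1/R$, and holding SF fixed gives $L = -R\,(1 + \mathrm{SF})/(1 - \mathrm{SF})$. Function names are hypothetical, and the differences are returned with their sign.

```python
def delta_sf(R_thr, sf_std, bwf):
    """delta-SF for a symmetry test: BWF is held constant."""
    L_thr = 1 / (bwf / 3 + 1 / R_thr)          # left slope implied by fixed BWF
    sf_thr = (L_thr + R_thr) / (L_thr - R_thr)
    return sf_thr - sf_std


def relative_delta_bwf(R_thr, sf, bwf_std):
    """delta-BWF / BWF for a bandwidth test: SF is held constant."""
    L_thr = -R_thr * (1 + sf) / (1 - sf)       # left slope implied by fixed SF
    bwf_thr = 3 * (1 / L_thr - 1 / R_thr)
    return (bwf_thr - bwf_std) / bwf_std


# Example: a right slope of about -54.05 dB/octave at BWF = 0.1 corresponds to
# a signal with SF = 0.11, i.e. delta-SF = 0.11 relative to a symmetric standard.
example = delta_sf(-6 / (0.1 * 1.11), sf_std=0.0, bwf=0.1)
```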

The  two  types  of  threshold  measures  described  above  imply  different  detection 
models.  We  shall  emphasize  in  this  paper  the  presentation  and  interpretations  of  the 
first  type  of  threshold.  The  rms-thresholds  for  all  tests  are  compiled  in  Appendix  I, 
mostly  to  facilitate  comparisons  with  results  from  other  profile  analysis  experiments 
previously  reported. 

II.  DETECTION  OF  CHANGES  IN  SPECTRAL  PEAK  SYMMETRY 

For  all  testing  conditions  in  this  section,  peak  bandwidth  factor  (BWF)  was  kept 
constant  over  a  set  of  trials  so  that  both  standard  and  signal  were  of  the  same  BWF. 
This  forced  listeners  to  base  the  signal  detection  on  a  change  in  peak  symmetry  factor 
(SF). 

A.  Results  and  discussion 

1.  Dependence  on  symmetry  and  bandwidth  factors  of  the  standard 

A 41 component complex was used in this set of experiments. The peak amplitude
was fixed at a level which allowed it to be heard clearly (15 dB above the baseline).
The detection threshold was measured for standard peaks of four different bandwidth
factors (BWF = 0.1, 0.13, 0.2, 0.4), and five different symmetries (SF = 0, 0.1, 0.15,
0.2, 0.4), i.e., a total of 20 tests were run. The averaged results for five subjects are
presented in Figs. 3. In Fig. 3(a) the data are averaged over the four BWF’s and
plotted against SF. In Fig. 3(b), they are averaged over the five SF’s and plotted as a
function of BWF.

The fundamental result that emerges from these data is that, in the range of SF’s
and BWF’s tested, the detection of a change in peak symmetry (δSF) is largely
independent of the peak shape of the standard. Thus, δSF does not vary as a function of
SF (Fig. 3(a)). However, there is a slight consistent decrease in threshold as a function
of BWF (Fig. 3(b)). This is mostly evident for the narrowest peaks as δSF drops by
0.04 for the first 0.38 octave change in BWF (from BWF = 0.1 to 0.13), and by 0.03
for the next 1.62 octaves (from BWF = 0.13 to 0.4). For all other conditions, the δSF
at threshold is near 0.11.
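The octave distances quoted for the BWF steps are base-2 logarithms of the BWF ratios, and expressing the threshold drops per octave makes the contrast explicit. The following sketch only redoes this arithmetic; the helper names are hypothetical.

```python
import math


def octaves_between(x1, x2):
    """Distance from x1 to x2 in octaves (log base 2 of the ratio)."""
    return math.log2(x2 / x1)


def drop_per_octave(drop, x1, x2):
    """Threshold drop divided by the octave distance it is spread over."""
    return drop / octaves_between(x1, x2)


narrow = drop_per_octave(0.04, 0.1, 0.13)   # over about 0.38 octave
broad = drop_per_octave(0.03, 0.13, 0.4)    # over about 1.62 octaves
```

Under these numbers the drop per octave for the narrowest peaks is more than five times the drop per octave over the remaining range.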

Plots of the rms-thresholds of these tests are shown in Appendix I. They are
independent of SF and BWF, with average detection threshold at approximately -8.5 dB.

The  subjects  trained  relatively  quickly  to  distinguish  signal  from  standard  for  all 
test  conditions  above.  To  make  the  distinction,  they  reported  that  they  were  listening 
for  the  “higher”  sounding  complex  tone  (signal).  This  pitch-like  change  is  intrinsic  to 
the  symmetry  detection  task  as  defined  here,  because  the  signal  was  tilted  to  the  right 
from  the  standard,  i.e.,  towards  the  higher  frequencies.  This  “pitch”  effect  is  further 
explored  in  Sec.  IV  A. 

2.  Dependence  on  peak  amplitudes 

In order to determine how the detection threshold depended on peak levels, the
tests described in Sec. II A.1 were repeated at two other peak levels: 10 dB and 20 dB
above the baseline. To account for the fact that two new subjects participated in this
series of tests, experiments at 15 dB level were repeated as well. A total of 9 different
conditions were tested at each peak level: three SF’s (0, 0.2, 0.4) and three BWF’s
(0.1, 0.2, 0.4). The data obtained are presented in Figs. 4. As in Figs. 3, data are
averaged over the BWF’s in Fig. 4(a) and over the SF’s in Fig. 4(b), for each of the
levels.

Two  conclusions  can  be  derived  from  these  data: 

(1) The same trends described earlier hold regardless of peak levels. Thus, except
for the narrowest peak, all δSF thresholds are the same regardless of peak shapes
studied. The rms-thresholds are independent of the peak’s SF and BWF (Appendix I).
Note that on average, the two subjects here exhibited uniformly higher thresholds than
the earlier five in Sec. II A.1.

(2) δSF thresholds as a function of BWF (Fig. 4(b)) deteriorate faster at the narrowest
peaks with decreasing peak level. This rise is largely responsible for the upward
shift in the mean of δSF’s in Fig. 4(a) with decreasing peak levels. The overall slight
rise in thresholds may reflect the masking of the peak by the base, which presumably
increases for lower peak levels.
Figure 3: Symmetry change detection (δSF) thresholds for a 41 component complex and
15 dB peak amplitude, averaged over five subjects and: four BWF’s in (a), and five
SF’s in (b). The δSF threshold measure is defined as the change in SF between the
signal at threshold and the standard. In (b), the δSF increases for the narrowest BWF.
The error bars are the standard deviations of the means.

Figure 4: Symmetry change detection (δSF) thresholds for a 41 component complex and 3
peak amplitudes: 10 dB, 15 dB, and 20 dB, relative to baseline. The data are averages
of three subjects and: three BWF’s in (a), and three SF’s in (b). The values along
the ordinates are defined as in Fig. 3. The large error bars in (a) are due to the δSF
threshold increase at the narrowest BWF seen in (b). Points are slightly offset along
the abscissa for clarity.


3.  Spectral  density  dependence 

These  experiments  explored  threshold  dependence  on  the  spectral  density  of  the 
complex  while  keeping  total  base  bandwidth  constant  (0.2-5  kHz).  The  following 
signal  parameters  were  tested  with  four  subjects:  41,  21,  and  11  spectral  components, 
two  SF’s  (0,  0.4),  and  three  BWF’s  (0.1,  0.2,  0.4).  For  two  of  the  subjects,  additional 
SF’s  were  tested:  SF  =  0.1,  0.15,  and  0.2  in  the  41  component  tests,  and  SF  =  0.2  in 
the  21  component  case.  Peak  level  was  always  set  at  15  dB  above  the  baseline. 

Once again, all δSF values and trends described earlier largely hold regardless of
spectral densities (Figs. 5). The most prominent change in δSF thresholds occurs as
a function of spectral density at the narrowest peak (Fig. 5(b)). The threshold here
deteriorates rapidly as the spectral density decreases and, as in Figs. 4, it is largely this
accelerated rise that is responsible for the upward shifts in the mean δSF in Fig. 5(a).

Note  that  the  rms-threshold  plots  in  Appendix  I  do  not  immediately  present  a 
comparable  picture  since  the  rms-threshold  directly  reflects  also  the  change  in  overall 
peak  energy  as  the  density  is  varied. 

III.  DETECTION  OF  CHANGES  IN  SPECTRAL  PEAK  BANDWIDTH 
FACTOR 

Experiments described in this section measured detectability of a change in spectral
peak shape due only to a change in its bandwidth factor (BWF), while holding the
symmetry factor constant. In this sense, these experiments complement those described
earlier in Sec. II. For each test, the detection threshold was computed as the relative
change in the BWF of the standard, i.e., δBWF/BWF.

A.  Results  and  discussion 

1.  Dependence  on  symmetry  and  bandwidth  factors  of  the  standard 

As in Sec. II A.1, a 41 component complex was used and the peak level was kept at
15 dB above the baseline. Standards of three different bandwidth factors (BWF
= 0.1, 0.2, 0.4) and five different symmetry factors (SF = 0, 0.1, 0.15, 0.2, 0.4) were
used, i.e., a total of 15 conditions. The thresholds, averaged over three
subjects, are plotted in Figs. 6. The plot in Fig. 6(a) is of the average δBWF/BWF as
a function of SF. In Fig. 6(b), the thresholds are plotted as a function of BWF.

The basic result that emerges from all these tests is that the detection threshold
for a BWF change is the same regardless of peak shape (δBWF/BWF ≈ 0.22), over
the range of peak shapes studied. The corresponding unnormalized rms-thresholds are
shown in Appendix I.

Our subjects took longer to train for this task than for the symmetry change detection
task. Furthermore, the BWF rms-thresholds are in general higher than the SF
rms-thresholds. During the tests, subjects reported listening for several different sound
qualities, e.g., pitch and sharpness of sound, in order to recognize the signal. Some of
them reported changing their listening strategies depending on the testing conditions.

Figure 5: Symmetry change detection thresholds for 41, 21, and 11 component complexes,
and 15 dB peak level, averaged over four subjects and three BWF’s in (a) and
three SF’s in (b). Large error bars in (a) are due to δSF increase for the narrowest peak
(BWF = 0.1) seen in (b). Note that in (b), most δSF changes with spectral density
occur at the narrowest peak.

Figure 6: Bandwidth change detection (δBWF/BWF) threshold for 41 frequency components,
and 15 dB peak level, averaged over three listeners, and three BWF’s (0.1,
0.2, 0.4) in (a), and two SF’s (0 and 0.4) in (b). Data are slightly offset along the
abscissa for clarity.

2.  Dependence  on  peak  levels 

The  dependence  of  BWF  thresholds  on  peak  levels  was  examined  in  three  subjects 
over  the  following  conditions:  three  SF’s  (0,  0.2,  0.4),  three  BWF’s  (0.1,  0.2,  0.4),  and 
at  three  peak  levels  (10  dB,  15  dB,  20  dB).  Tests  at  15  dB  peak  level  were  repeated  to 
account for the fact that two new subjects participated in this sequence of tests. The
δBWF/BWF thresholds, first as a function of SF and then as a function of BWF, are
given in Figs. 7(a) and (b), respectively.

The plots confirm that, at a particular level, the δBWF/BWF threshold is largely
independent  of  peak  shape.  However,  thresholds  do  vary  as  a  function  of  peak  level, 
but  mostly  at  lower  peak  levels.  For  instance,  on  average,  the  rate  of  threshold  rise  in 
going  from  the  20  dB  to  the  15  dB  peaks  is  less  than  half  of  that  seen  between  15  dB 
and  10  dB. 

3.  Spectral  density  dependence 

Dependence  of  BWF  thresholds  on  the  spectral  density  was  examined  for  the  15 
dB  peak  level  using  11,  21,  and  41  component  complexes.  The  average  results  of  three 
listeners,  using  two  SF’s  (0  and  0.4)  and  three  BWF’s  (0.1,  0.2,  0.4),  are  presented  in 
Figs.  8.  In  Fig.  8(a),  they  are  given  as  a  function  of  SF,  and  in  Fig.  8(b)  as  a  function 
of  BWF.  The  corresponding  rms-thresholds  are  shown  in  Appendix  I. 

Once again, δBWF/BWF thresholds remain constant for all conditions tested, i.e.,
regardless of peak shape and spectral density. The one obvious exception is at the
broadest peak for the 11 component case, where the threshold is significantly larger.

IV.  TWO  CONTROL  EXPERIMENTS  FOR  SF  AND  BWF  CHANGE 
DETECTION 

In  this  section,  we  present  the  results  of  two  control  experiments.  In  the  first  we 
randomized  the  location  (frequency)  of  the  peak  between  signal  and  standard  in  order 
to  minimize  or  abolish  the  “pitch”  cues  that  may  underlie  the  detection  of  SF  and 
BWF  changes.  In  the  second  experiment  we  assessed  the  relative  contribution  of  the 
change  in  peak  energy  to  the  detection  threshold. 

A.  Effects  of  peak  frequency  randomization 

Numerous experimental results have suggested that the detection of spectral shape
changes may in some cases be effectively mediated by pitch cues associated with these
spectral changes ([Berg, Nguyen and Green, 1992; Feth, O’Malley and Ramsey, 1982;
Richards, Onsan and Green, 1989; Stover and Feth, 1983]). In order to assess the possible
contribution of such a pitch cue in our tests, we measured the effect on thresholds
of randomizing peak locations, a procedure which in effect destroys the pitch cue. The
change in thresholds was then compared to what would be predicted from the theoretical
strength of the pitch cue computed for each test using the so-called Ewaif model
(reviewed briefly in Appendix II).

Figure 8: δBWF/BWF thresholds for 41, 21, and 11 component complexes, and 15
dB peak level, averaged over three subjects and three BWF’s in (a) and two SF’s in (b).
Threshold is independent of spectral density for all but the broadest BWF, where it
increases for the 11 component case.

1.  Stimulus 

The entire spectral content was randomly shifted in order to prevent listeners from
using the complex pitches of the standard and the signal for spectral shape change
detection. The frequency shift was achieved by randomly changing the sampling period
over a range of 40 μs to 45 μs, in steps of 0.5 μs. This amounts to shifting the central
component anywhere from 1000 Hz down to 889 Hz, and all the other components
accordingly, so as to preserve the relative frequency spacing.
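The size of the frequency shift follows from the ratio of sampling periods: a waveform computed for a 40 μs period (25 kHz) and played back with a longer period is lowered in frequency by that ratio. A minimal sketch of this arithmetic, with hypothetical function and variable names:

```python
def shifted_frequency(f_nominal_hz, period_us, nominal_period_us=40.0):
    """Frequency actually produced when the playback sampling period changes."""
    return f_nominal_hz * nominal_period_us / period_us


# The 11 sampling periods from 40 to 45 microseconds in 0.5 microsecond steps.
periods_us = [40.0 + 0.5 * k for k in range(11)]
peak_freqs = [shifted_frequency(1000.0, p) for p in periods_us]
```

Since every component is multiplied by the same factor, the ratios between components, and hence their spacing on the logarithmic frequency axis, are unchanged.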

Two subjects participated in the SF and three in the BWF change detection series. They
were tested at two SF’s (0, 0.4) and three BWF’s (0.1, 0.2, 0.4) for the 41 component
signals and 15 dB peak level. Thresholds measured are presented in Tables
1(a) and 1(b), for the SF and BWF change tests, respectively. In each table, the first
and second rows contain the detection thresholds for the non-randomized (NR) and
randomized (R) peaks. The third row lists their differences (NR-R). The next two rows
are the computed Ewaif pitches of standard ($F_{std}$) and signal ($F_{sig}$) at NR thresholds.
The $\Delta F$ row shows the difference of the previous two. The last row is the relative
pitch difference $\Delta F/F_{std}$. The Ewaif pitches were computed for zero phases, which
corresponds to our stimulus condition.

2.  Assessing  the  data  using  the  Ewaif  model 

In order to assess the contribution of a pitch cue to the detection of changes in our stimulus, the following two arguments were used (see [Richards, Onsan and Green, 1989] for details):

1) If the detection process relies primarily on a pitch cue (as defined by the Ewaif model), then some minimal pitch difference, ΔF ([Feth and Stover, 1987]), or relative pitch difference, ΔF/F_std ([Richards, Onsan and Green, 1989]), is necessary for detection. Therefore, at perceptual thresholds ΔF or ΔF/F_std should remain relatively constant.

2) If a threshold deterioration occurs due to the uncertainty in the randomized signal, and not due to the pitch differences across the testing conditions, then it should be uniform across all conditions. Otherwise, the deterioration probably reflects the effective contribution of the pitch cue. This is evaluated by the change in values of the NR-R row in Tables 1(a) and 1(b).

3.  Results  and  discussion 

(i)  Effects  on  detection  of  SF  changes  (Table  1(a)) 

With respect to the first argument above, it is clear from the ΔF and ΔF/F_std values in Table 1(a) that not all pitch cues are equal at threshold, since both increase approximately 4-fold over the SF’s and BWF’s tested. However, the rise in δSF for the narrowest peak might be due to a decreasing pitch cue. This is further supported by the data with respect to the second argument, namely that the randomization affects only the δSF thresholds of the narrower peaks. Therefore, the evidence here suggests that the pitch cue may be effective only for these peaks.

BWF                        0.1                  0.2                  0.4
SF                      0        0.4         0        0.4         0        0.4
NR                      0.27     0.27        0.13     0.15        0.13     0.11
R                       0.36     0.44        0.27     0.23        0.17     0.12
-(NR - R)               0.09     0.17        0.14     0.08        0.04     0.01
F_std                1290.52  1327.10     1227.73  1329.10     1223.27  1430.59
F_sig                1315.56  1315.59     1258.51  1369.30     1293.12  1495.99
ΔF = F_std - F_sig    -25.04   -18.19      -30.83   -40.25      -70.13   -65.39
ΔF/F_std · 100         -1.94    -1.37       -2.51    -3.03       -5.73    -4.57

Table 1: (a) Symmetry factor change detection threshold (δSF), for the 41 component complex for non-randomized (NR) and randomized (R) spectra. The first two rows are the NR and R δSF thresholds. The third row is the difference of the first two. The fourth and fifth rows are the computed Ewaif pitches of the standard (F_std) and signal (F_sig) at perceptual threshold levels for the NR condition, for zero-phase components. The ΔF row is ΔF = F_std - F_sig. The last row is the relative pitch difference, ΔF/F_std.

(ii)  Effects  on  detection  of  BWF  changes  (Table  1(b)) 

The ΔF and ΔF/F_std values vary greatly (approximately 7-fold) across the SF’s and BWF’s. Note also a change in sign of ΔF across various testing conditions. This strongly suggests that the pitch cue plays a minimal role in this discrimination task. Furthermore, the nearly uniform increase of the thresholds when the signal is randomized supports the notion that it is due to an uncertainty effect rather than to the abolition of a pitch cue.

2.99    0.73    0.95    -3.76    -2.19

Table 1: (b) Bandwidth factor change detection threshold (δBWF/BWF), for the 41 component complex for non-randomized (NR) and randomized (R) spectra. The table is organized as Table 1(a). Note a change in sign of ΔF across various testing conditions, which may explain the change in strategies that our subjects reported in performing this task.

B. Detecting peak energy change compared to a BWF change

In all the BWF change tests, the peak energy was not equalized as the peak width was altered. So, it could be argued that the δBWF/BWF threshold reflects a change in the energy of the peak, rather than in the BWF per se. One indirect argument against this conclusion is that the rms-thresholds for BWF changes were sometimes 7 dB worse than those for SF changes (e.g., in Appendix I, compare data points at BWF = 0.4 in Figs. A3(b) and A6(b)). If the tasks were purely based on the total change in energy, then the two thresholds should be comparable.

A more direct rebuttal of this hypothesis is provided in Fig. 9, where the rms-thresholds of three different tests are compared. In tests A and B, the BWF of the peak was changed in one of two ways: either as usual through a change in the width (Fig. 9(a), test A), or through a change in the height only (test B)¹ of the peak. In test C, the peak’s BWF was kept constant and the rms-threshold was measured for changes in the peak size, and not its shape. In all three tests, 41 component stimuli were used with a starting peak level of 15 dB. Three subjects were tested at two SF’s (0 and 0.4) and three BWF’s (0.1, 0.2, 0.4).

¹In test B, for each peak level of the signal, A_signal, the slopes are computed as R_signal = (A_signal / A_max) · R_standard, and L_signal = R_signal, where R_standard and A_max are the right slope and the peak level of the standard.

Figure 9: (a) Effects of BWF changes in tests A and B (see text), and peak level changes in test C are shown by dashed lines, for a standard peak with BWF = 0.2, SF = 0, and A_max = 15 dB.

The data in Fig. 9(b) reveal that the rms-thresholds (and δBWF/BWF thresholds) are very close for the two BWF change detection tests (A and B). They are also uniformly and significantly lower (approximately 6 dB) than those due to a change in peak size alone (test C). The conclusions we draw are that (1) a BWF change is a more effective feature to detect than just scaling the peak, and that (2) the relatively small changes in peak energy associated with the BWF tests (as in A and B) are unlikely to contribute significantly to the BWF change detection thresholds.

V.  BROADER  INTERPRETATIONS  OF  SF  AND  BWF  CHANGES 

In all experiments so far, the changes in peak shapes were parameterized in terms of SF and BWF changes. There is, however, an equivalent and more general description of these two manipulations. For instance, a four-fold increase in BWF (from BWF = 0.1 to 0.4) can be viewed as a stretching (or a dilation) of the peak profile along the ω-axis, i.e., p(ω) becomes p(αω) with α = 1/4 (see Fig. 10(a)). This change in p(ω) can be equivalently described in the Fourier transform domain of the profile. Namely, if P(Ω) is the Fourier transform of p(ω), then dilating the profile by a factor α causes its transform to become (1/α) · P(Ω/α) (Fig. 10(b))².

Figure 9: (b) The rms-thresholds for the three tests (A, B, and C).
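The dilation rule for the transform follows from a change of variable in the transform integral. The short derivation below is a sketch that assumes the convention P(Ω) = ∫ p(ω) exp(-j2πΩω) dω, with Ω in cycles/octave, and α > 0; the sign and normalization convention is an assumption, since the text does not state it.

```latex
\begin{aligned}
\mathcal{F}\{p(\alpha\omega)\}(\Omega)
  &= \int_{-\infty}^{\infty} p(\alpha\omega)\, e^{-j 2\pi \Omega \omega}\, d\omega \\
  &= \frac{1}{\alpha}\int_{-\infty}^{\infty} p(u)\, e^{-j 2\pi (\Omega/\alpha) u}\, du
     \qquad (u = \alpha\omega,\ \alpha > 0) \\
  &= \frac{1}{\alpha}\, P\!\left(\frac{\Omega}{\alpha}\right).
\end{aligned}
```

For α = 1/4 this gives 4 · P(4Ω): every feature of |P(Ω)| moves down by a factor of four in Ω (two octaves on a logarithmic Ω axis) while its shape on that axis is unchanged, which is the kind of shift described for Fig. 10(b).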

The change in the SF of a peak p(ω) can also be expressed in terms of a corresponding (though somewhat less intuitive) modification of the peak transform P(Ω). Specifically, if a small constant phase angle θ0 is added to the phases of all components of the transform P(Ω), then the corresponding profile p(ω) becomes tilted in a manner very similar to that caused by an SF change. This is demonstrated in Figs. 10(c) and (d) for three SF’s and their corresponding θ0 angles: SF = 0.05 (3°), 0.15 (9°), and 0.3 (18°). (The computations are in Appendix III).
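The tilting effect of a constant phase can be illustrated numerically. The sketch below is not the computation of Appendix III: it uses a hypothetical Gaussian-shaped symmetric peak on a circular 64-point axis and a plain discrete Fourier transform as stand-ins, adds θ0 to every positive-frequency component (and -θ0 to its mirror component so that the profile stays real), and reports a simple asymmetry measure.

```python
import cmath
import math

def dft(x):
    """Plain discrete Fourier transform (O(n^2), fine for short profiles)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def add_constant_phase(profile, theta_deg):
    """Add theta to every positive-frequency bin and -theta to its mirror bin,
    so the result stays real; the DC and Nyquist bins are left untouched."""
    n = len(profile)
    theta = math.radians(theta_deg)
    spectrum = dft(profile)
    for m in range(1, n):
        if 2 * m < n:
            spectrum[m] *= cmath.exp(1j * theta)
        elif 2 * m > n:
            spectrum[m] *= cmath.exp(-1j * theta)
    return [value.real for value in idft(spectrum)]

def asymmetry(profile):
    """Sum of |p[k] - p[-k]| about index 0; zero for an even (symmetric) profile."""
    n = len(profile)
    return sum(abs(profile[k] - profile[-k]) for k in range(1, n // 2))

# Hypothetical symmetric peak centred on index 0 of a circular 64-point axis.
N = 64
peak = [math.exp(-0.5 * (min(k, N - k) / 3.0) ** 2) for k in range(N)]

for theta_deg in (0, 3, 9, 18):
    tilted = add_constant_phase(peak, theta_deg)
    print(f"theta0 = {theta_deg:2d} deg -> asymmetry = {asymmetry(tilted):.4f}")
```

In this sketch the magnitude of the transform is untouched, so the bandwidth-related information is preserved while the asymmetry grows with θ0, which is the sense in which the two manipulations act along separate axes.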

²The units of Ω in the profile transform domain are the number of cycles per unit distance (octave) along the ω axis. For instance, a sinusoidal profile with Ω0 = 2 cycle/octave is a sinusoidal profile whose peaks are separated by 1/2 octave along the ω axis.

The magnitude of the Fourier transform of the peak profile (Fig. 10(b)) is computed for Ω > 0 as:

\[
|P(\Omega)| = \frac{a_{\max}}{b}\,\frac{20\,\mathrm{BWF}}{3(\ln 10)\,M}\,
\left|\,1 + j\,2\pi\Omega\,\frac{20}{3\ln 10}\,\mathrm{SF}\cdot\mathrm{BWF}
+ \left(\pi\Omega\,\frac{20}{3\ln 10}\,\mathrm{BWF}\right)^{2}\left(1-\mathrm{SF}^{2}\right)\right|^{-1},
\]

and

\[
|P(0)| = 1 + \frac{a_{\max}}{b}\,\frac{20\,\mathrm{BWF}}{3(\ln 10)\,M}.
\]

Figure 10: (a) Peak profiles with A_max = 15 dB, BWF’s 0.1 and 0.4 (solid lines), and BWF’s at 25% detection threshold (dashed lines). (b) Magnitude of the profiles’ Fourier transformations, |P(Ω)|. The effect of the BWF change is a shift in magnitude (and not a change in shape) along the log Ω axis.

The above interpretations of δBWF and δSF imply that these manipulations can be readily applied to any arbitrary spectral profile. The sensitivity measurements can then be directly compared across different profiles. Specifically, we shall be interested in comparing the dilation (δBWF/BWF) and phase-shift (δSF) thresholds of the peaks to those of sinusoidally modulated spectra, or ripples, which are the basis functions of the Fourier transform. Dilating a rippled spectrum simply changes its ripple (or envelope) frequency, and shifting the spectrum along the ω-axis changes its phase (Fig. 11). While ripple frequency-difference-limen thresholds were measured previously ([Green, 1986; Hillier, 1991]), no ripple phase sensitivities have been reported in the literature. The experiments described below provide these measurements.

VI.  PHASE  DIFFERENCE  LIMEN  EXPERIMENTS 

Sensitivity to ripple phase changes was measured in sinusoidally modulated profiles on a dB amplitude scale (Fig. 11), and the thresholds, termed phase-difference-limen (pdl), are reported in units of degrees.

Figure 10: (c) and (d) The effects on the symmetry of a peak profile (BWF = 0.4 and SF = 0) of adding constant phases (3°, 9°, and 18°) to its Fourier transform.

Figure 11: A sinusoidal ripple profile with ripple frequency of 2 cycle/octave, and 15 dB peak-to-valley amplitude (computed as 20 log10(a_max/a_min)). Its 16° phase shifted version is shown in dashed lines.

A. Stimulus

For all testing conditions, the number of frequency components was 161 (34 per octave), and the frequency components were equally spaced on a logarithmic scale between 0.2 and 5 kHz. The starting ripple phase was kept constant at zero degrees for the data reported here. Other starting phases were also tested, and results were very similar. The peak-to-valley ratio was defined as 20 log10(a_max/a_min), where a_max and a_min are the peak and valley amplitudes of the sinusoid (see Fig. 11). The ripple frequency (Ω) was fixed over a set of trials at 0.25, 0.5, 1, 2, or 4 cycle/octave, for 15 dB and 25 dB peak-to-valley ratios. One of the two subjects also completed the test for 8 cycle/octave, and for 35 dB peak-to-valley ratio, while the other was tested at 2 cycle/octave and 20 dB and 35 dB levels.
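A ripple profile of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions rather than the stimulus-generation code used in the experiments: it assumes a sine-phase ripple whose phase origin is the low-frequency edge, component levels centred on 0 dB, and a hypothetical function name and parameter set.

```python
import math

def ripple_profile(n_components=161, f_low_hz=200.0, f_high_hz=5000.0,
                   ripple_freq=2.0, peak_to_valley_db=15.0, phase_deg=0.0):
    """Return (frequencies in Hz, levels in dB) of a sinusoidal ripple profile.

    Components are equally spaced on a log-frequency axis; the level of each
    component follows a sinusoid in octaves above f_low_hz.
    """
    octaves = math.log2(f_high_hz / f_low_hz)
    freqs, levels = [], []
    for i in range(n_components):
        x = octaves * i / (n_components - 1)      # position in octaves
        freqs.append(f_low_hz * 2.0 ** x)
        levels.append(0.5 * peak_to_valley_db *
                      math.sin(2.0 * math.pi * ripple_freq * x + math.radians(phase_deg)))
    return freqs, levels

freqs, levels = ripple_profile()
per_octave = (len(freqs) - 1) / math.log2(freqs[-1] / freqs[0])
print(f"{len(freqs)} components, about {per_octave:.1f} per octave")
print(f"peak-to-valley level: {max(levels) - min(levels):.2f} dB")
```

Since the levels are in dB, the difference between the largest and smallest level equals 20 log10(a_max/a_min); with a finite number of components it approaches, but need not exactly reach, the nominal peak-to-valley value.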

The  overall  intensity  was  varied  across  and  within  the  trials  over  a  20  dB  range  in 
1  dB  steps. 

B.  Results 

The  average  data  for  two  subjects  are  presented  in  Fig.  12(a)  as  a  function  of  ripple 
frequency,  for  two  levels.  The  results  show  that  thresholds  are  constant  below  about  2 
cycle/octave  at  both  levels  tested,  achieving  a  minimum  of  about  6°  for  the  larger  level. 
Phase  sensitivity  decreases  with  increasing  ripple  frequencies  beyond  2  cycle/octave. 

Figure  12(b)  depicts  the  data  for  individual  subjects  as  a  function  of  ripple  level. 
Thresholds  saturate  with  increasing  levels  at  all  ripple  frequencies  tested. 

C.  Discussion 

There are two important characteristics of the data in Fig. 12(a). The first is that for lower ripple frequencies (Ω), subjects detect a constant phase shift and not a constant displacement of the peaks, as is probably the case for Ω > 2 cycle/octave. The second is that the lowest detectable phase shift (6°) is very close to the phase shift implied by the δSF thresholds (≈ 0.11) measured for the peaks (Figs. 3). The correspondence between these two thresholds confirms the association made between them as explained in Sec. V. It also suggests that this threshold is independent of the particular spectral shape used. The implications of this finding are discussed in more detail in Part II of this paper.
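Both points can be made concrete with a little arithmetic. The sketch below assumes that the three SF and phase pairs quoted in Sec. V (0.05 and 3°, 0.15 and 9°, 0.3 and 18°) can be extended linearly at 60° per unit SF, which is an extrapolation and not a relation stated in the text; the function names are illustrative.

```python
def sf_change_to_phase_deg(delta_sf, degrees_per_unit_sf=60.0):
    """Phase angle matching an SF change, assuming the linear 60 deg/SF trend."""
    return degrees_per_unit_sf * delta_sf

def peak_displacement_octaves(phase_shift_deg, ripple_freq):
    """Shift of the ripple peaks along the log-frequency axis for a given phase shift."""
    return (phase_shift_deg / 360.0) / ripple_freq

print(f"delta SF = 0.11 -> about {sf_change_to_phase_deg(0.11):.1f} degrees")
for ripple_freq in (0.25, 0.5, 1.0, 2.0):
    shift = peak_displacement_octaves(6.0, ripple_freq)
    print(f"6 degrees at {ripple_freq} cycle/octave -> {shift:.4f} octave")
```

Under this assumption a δSF of 0.11 corresponds to roughly 6.6°, close to the 6° ripple threshold, while a fixed 6° phase shift corresponds to an eight-fold smaller peak displacement at 2 cycle/octave than at 0.25 cycle/octave.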

VII.  GENERAL  DISCUSSION 
A.  Summary  of  basic  results 

The experiments described here measured subjects’ ability to detect changes in the symmetry and bandwidth factors of spectral peaks under various conditions. The choice of these spectral features was inspired by the physiological finding that the primary auditory cortex explicitly encodes the locally averaged gradient of the acoustic spectrum. In the case of spectral peaks, the local gradient is directly related to the symmetry of the peak. Since the shape of a peak can be effectively described by its symmetry and bandwidth, our goal was to examine the perceptual sensitivity to, and interdependence between, these features.

Figure 12: (a) Phase difference limen threshold (pdl) as a function of ripple frequency, for 15 dB and 25 dB peak-to-valley amplitudes (or ripple levels), averaged over 2 subjects. (b) Individual pdl thresholds at three ripple frequencies as a function of ripple level (subject 1 was tested at 0.25 and 8 cycle/octave, and subject 2 at 2 cycle/octave).

The basic result that emerges is that thresholds to changes in SF and BWF are (with one exception) approximately constant regardless of the peak shape parameters tested. Thus, for the detection of SF changes, δSF thresholds are near 0.11 for all SF’s and almost all BWF’s (Figs. 3). The exception occurs towards the narrowest peak (BWF = 0.1) where (1) the detection threshold increases gradually to 0.16 (Fig. 3(b)), and (2) pitch cues associated with this detection task become more effective (Sec. IV A.3 (i), Table 1(a)). For the detection of BWF changes, all δBWF/BWF thresholds remain constant at around 0.22 regardless of the peak shape (Figs. 6).

Also measured were the effects of two additional manipulations that did not change the shape of the peak: (1) a change in the peak level and (2) a change in the spectral density of the complex. For the first, all thresholds maintain the same trends regardless of peak level. Their absolute values, however, slightly improve for higher peak levels (Figs. 4 and 7). For lower peaks, the deterioration in δBWF/BWF thresholds accelerates with decreasing peak levels. It is possible that the uniform rise in threshold is mediated by increased masking effects of the base upon the smaller peak. For the second manipulation, δSF thresholds increase gradually with decreasing densities only at the narrowest peak (Fig. 5(b)), whereas δBWF/BWF thresholds deteriorate only for the lowest density (11 components) at the broadest peak (Fig. 8(b)).

Finally,  rms-threshold  values  for  SF  and  BWF  detection  tasks  are  comparable  to 
other  profile  detection  tasks  (see  Appendix  I).  Furthermore,  they  are  significantly  lower 
than  rms-thresholds  of  changes  that  do  not  affect  peak  shape  (Fig.  9(b)). 

In summary, a fundamental conclusion from these data is that the detection of peak shape changes can be parametrized along two sensitive and largely independent axes: peak SF and BWF. This result lends support to the notion that the underlying physiological representation of these two features of a peak may be separated along orthogonal dimensions. For instance, one conjecture might be that the SF is mapped explicitly by the gradient map found in AI ([Shamma et al., 1993]). This map would then be duplicated more than once, each copy at a different scale of local averaging or bandwidth, in essence providing the BWF dimension. While a physiological substrate for such a multiscale representation is not yet available in AI, maps of gradually changing tuning in the response areas of cells along the isofrequency planes in AI are in harmony with this view ([Schreiner and Mendelson, 1990]).

B.  Profile  analysis  models 

The choice of a threshold measure implies an underlying profile analysis model. Such a model based on the δSF and δBWF/BWF measures is described in detail in Part II of this paper. Here we apply two alternative profile analysis models which have been shown to perform well in a variety of detection tasks. Both presume that profile changes are conveyed by independent channels distributed across the spectrum. The first is the channel model for discrimination of broadband spectra proposed by [Durlach, Braida and Ito, 1986], which basically combines information from all independent channels. The other is the maximum difference model described in [Bernstein and Green, 1987], which is based on detecting the largest difference between any pair of components in the signal, i.e., it uses only two channels in computing the thresholds. We examine how these two models predict the detectability of peak shape changes by monitoring the constancy of the index d' computed at perceptual thresholds at various SF and BWF combinations.

1.  The  channel  model 

This model is described in detail in [Durlach, Braida and Ito, 1986; Green, 1988]. It consists of N noisy channels whose variances (σ²) are assumed to be constant. Some interdependence between the channel outputs is introduced because of the level randomization in the experiments. The uniform roving level distribution over a 20 dB range (σ_R = 5.6) is approximated by a normal distribution with σ_R = 5. Furthermore, it is assumed that the channel variances are such that σ_R² · N ≫ σ². The level difference between the standard’s and signal’s ith component is defined as Δ_i = 20 log((p_i)_signal/(p_i)_standard). These assumptions lead to d' = √(ΣΔ_i² - (ΣΔ_i)²)/σ. The numerator (or d'σ) was computed at perceptual thresholds for different testing conditions (Tables II) and at the limits of the error bars, in order to determine its sensitivity to threshold changes.

For δSF tests, the stimuli are “balanced” (see [Durlach, Braida and Ito, 1986]), in that ΣΔ_i ≈ 0, or at least (ΣΔ_i)² ≪ ΣΔ_i². The channel model predicts reasonably well the average thresholds as a function of the peak’s starting symmetry (SF) (Table II(a)). It fails, however, to predict the δSF threshold trends as a function of BWF. For instance, to maintain a constant d'σ, the average δSF at BWF = 0.2 (21 component stimulus) needs to be larger by 21% (≈ 0.20). A similar decrease in threshold is necessary at BWF = 0.4 for the 41 component stimulus (27%, or to 0.08).

For δBWF tests, all d'σ’s, with one exception, are comparable when considering the significant overlap due to the error bars (Table II(b)). The only stimulus for which the model clearly fails is the broadest symmetric peak (SF = 0, BWF = 0.4) for both spectral densities.

The model also fails to account for the detection thresholds measured in the control experiment (C) described in Sec. IV B. Specifically, it predicts higher than perceptual thresholds for the narrowest peaks (Table II(c)).

Finally, the d'σ for the phase data (Table II(d)) increases with increasing threshold values at higher ripple frequencies. This is true for both 15 dB and 25 dB levels. The model therefore predicts a constant pdl instead of the increasing thresholds seen at higher ripple frequencies. For instance, a d'σ = 10.51 for the 8 cycle/octave and 15 dB level stimulus would predict a 9° threshold, compared to the 49.58° perceptual value. Note that the stimuli here are “balanced” as in the δSF tests.

d'σ for δSF test (21 components)
                                    BWF
SF               0.1            0.2            0.4            average
0                4.46 ±0.13     3.15 ±0.10     4.36 ±0.52     3.99
0.4              3.68 ±0.07     3.58 ±0.13     4.84 ±0.30     4.03
average          4.07           3.37           4.60
δSF threshold    0.34 ±0.01     0.17 ±0.01     0.17 ±0.02

d'σ for δSF test (41 components)
                                    BWF
SF               0.1            0.2            0.4            average
0                2.96 ±0.19     2.88 ±0.13     3.97 ±0.18     3.27
0.4              3.36 ±0.22     3.22 ±0.15     4.33 ±0.20     3.64
average          3.16           3.05           4.15
δSF threshold    0.16 ±0.01     0.11 ±0.005    0.11 ±0.005

Table 2: d'σ values for the “independent channel model” (Sec. VII B.1). (a) d'σ for δSF tests for 21 and 41 component spectra, evaluated at threshold and error bar limit values (in brackets) which are given at the bottom of each table. (For example, for BWF = 0.1 and SF = 0, δSF = 0.34 with error bar limit of ± 0.01, and the corresponding d'σ = 4.46 ± 0.13.) Thresholds are from Figs. 5(b) and 3(b) for the 21 and 41 component cases.

d'σ for δBWF test (21 components)
                                    BWF
SF                   0.1            0.2            0.4            average
0                    3.04 ±0.27     2.68 ±0.16     1.98 ±0.11     2.57
0.4                  3.03 ±0.27     2.80 ±0.19     2.52 ±0.17     2.78
average              3.03           2.74           2.25
δBWF/BWF threshold   30 ±3%         25 ±5%         25 ±2%

d'σ for δBWF test (41 components)
                                    BWF
SF                   0.1            0.2            0.4            average
0                    3.11 ±0.21     3.02 ±0.19     2.58 ±0.21     2.90
0.4                  3.12 ±0.25     3.16 ±0.20     3.27 ±0.25     3.18
average              3.11           3.09           2.92
δBWF/BWF threshold   21.5 ±1.8%     20.0 ±1.1%     23.6 ±2.1%

Table 2: (b) d'σ values for δBWF tests. The table is organized as Table II(a), with threshold values from Figs. 8(b) and 6(b) for the two density tests.

d'σ for control experiment C
                                    BWF
SF               0.1            0.2            0.4            average
0                4.22 ±0.86     6.40 ±1.86     5.46 ±1.28     5.36
  threshold      2.79 ±0.55     3.19 ±0.98     3.11 ±0.81
0.4              4.41 ±0.80     6.51 ±1.02     6.62 ±1.29     5.85
  threshold      2.91 ±0.51     3.53 ±0.53     3.63 ±0.69
average          4.31           6.45           6.04

Table 2: (c) Similar to Table II(a) for control experiment C (Fig. 9(b); thresholds are given separately for the two SF’s).

d'σ for phase ripple experiment
                          ripple frequency (cycle/octave)
               0.25      0.5       1         2         4         8
d'σ            9.59      10.11     8.89      9.17      19.0      49.58
pdl, 15 dB     8.09°     9.08°     7.53°     8.1°      16.22°    43.17°
d'σ            10.67     10.20     10.08     14.32     28.39
pdl, 25 dB     5.39°     5.31°     5.11°     7.31°     14.53°

Table 2: (d) d'σ for pdl tests with spectral sinusoids, for 15 dB and 25 dB peak levels and thresholds from Fig. 12(a).

In summary, the channel model predicts reasonably well most of the threshold trends measured in our experiments. However, we can discern no simple pattern to the failures since they occur at various BWF’s, and are apparently unrelated to the number of stimulus components (for the two cases tested). It is possible that some of the simplifying assumptions are invalid, for instance the constant σ over all channels or the actual number of independent channels used.

2.  The  maximum  difference  model 

This model is based on the detection of the maximum level difference between only two spectral regions. The model was derived from experimental results with flat standards, and was defined accordingly for such tests. It predicts well the thresholds in a number of profile analysis tasks [Bernstein and Green, 1987]. In order to apply the model to the peak stimuli, the computational scheme was slightly modified. For instance, we define the level difference between the ith and jth frequency components as

\[
\Delta_{ij} = 20\log\frac{(p_i)_{\mathrm{signal}}}{(p_i)_{\mathrm{standard}}}
            - 20\log\frac{(p_j)_{\mathrm{signal}}}{(p_j)_{\mathrm{standard}}}.
\]

Also, contrary to the assumption of the original model ([Bernstein and Green, 1987]), we take the σ’s to be constant for all channels, and hence the largest d' is defined by the largest Δ_ij, or Δ.
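The quantity Δ is simple to compute. The sketch below is an illustration with hypothetical function names and made-up amplitudes, not the authors' code: it forms the per-component level differences in dB and takes the largest pairwise difference, which equals the largest level difference minus the smallest.

```python
import math

def level_differences_db(signal_amps, standard_amps):
    """Per-component level difference, 20 log10(signal / standard), in dB."""
    return [20.0 * math.log10(s / t) for s, t in zip(signal_amps, standard_amps)]

def max_difference_db(signal_amps, standard_amps):
    """Largest Delta_ij over all component pairs (i, j).

    The maximum of Delta_i - Delta_j over all pairs is max(Delta) - min(Delta).
    """
    deltas = level_differences_db(signal_amps, standard_amps)
    return max(deltas) - min(deltas)

# Made-up example: one component raised by 1 dB and another lowered by 1 dB.
standard = [1.0, 1.0, 1.0, 1.0]
signal = [1.0, 10.0 ** (1.0 / 20.0), 1.0, 10.0 ** (-1.0 / 20.0)]
print(f"Delta = {max_difference_db(signal, standard):.2f} dB")

# Scaling the whole signal (an overall level rove) leaves Delta unchanged.
roved = [3.0 * a for a in signal]
print(f"Delta after level rove = {max_difference_db(roved, standard):.2f} dB")
```

Because an overall level change adds the same constant to every component's level difference, Δ is unaffected by the level randomization used in the experiments.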

The computed Δ’s for the δSF tests are shown in Table III(a). As a function of a peak’s BWF, the trends are well predicted for the larger BWF’s (0.2 and 0.4). For the narrow peaks (BWF = 0.1), the model predicts a smaller threshold than is observed. It also predicts a slight dependence of the thresholds on SF where none exists in the data. Note that including a variable σ would probably worsen the predictions. This is because for broader peaks Δ occurs closer to the edges of the profile where σ is larger. Consequently d' becomes smaller than indicated by the table.

For δBWF tests, Δ is approximately constant for all SF’s and BWF’s except for the narrowest peak for the 21 component stimulus (Table III(b)). Therefore, with the assumption of constant σ’s across all spectral regions, the model predicts well the perceptual trends. However, increasing σ’s towards the edges (as in [Bernstein and Green, 1987]) would cause the d' to decrease with increasing BWF, erroneously predicting higher thresholds for these conditions.

Predicted threshold trends for the control experiment are consistent (Table III(c)) with the experimental ones, since Δ values appear scattered around 2.7 dB for all conditions. They are, however, larger than those of the δBWF tests, indicating that the model accounts well for the trends in the thresholds, but not for their absolute values.

The maximal difference for the ripple-phase data is roughly constant at lower ripple frequencies, and follows the threshold increase at higher ripple frequencies (Table III(d)). This is similar to the d'σ trend in the channel model. Thus, the maximal difference model predicts constant thresholds at all ripple frequencies, contrary to the observed data.

Δ for δSF test (21 components)
                            BWF
SF               0.1       0.2       0.4       average
0                5.12      2.53      2.58      3.41
0.4              4.35      3.29      3.31      3.65
average          4.73      2.91      2.94
δSF threshold    0.34      0.17      0.17

Δ for δSF test (41 components)
                            BWF
SF               0.1       0.2       0.4       average
0                2.39      1.65      1.66      1.90
0.4              3.07      2.04      2.09      2.40
average          2.73      1.84      1.87
δSF threshold    0.16      0.11      0.11

Table 3: (a) Maximal difference levels, Δ (dB), for the “maximal difference model” (Sec. VII B.2). Tables are organized as Tables II (with the same threshold values as in Tables II).

Δ for δBWF test (21 components)
                                BWF
SF                   0.1       0.2       0.4       average
0                    1.85      1.64      1.68      1.72
0.4                  1.86      1.66      1.68      1.73
average              1.85      1.65      1.68
δBWF/BWF threshold   30%       25%       25%

Δ for δBWF test (41 components)
                                BWF
SF                   0.1       0.2       0.4       average
0                    1.43      1.37      1.59      1.46
0.4                  1.45      1.37      1.59      1.47
average              1.44      1.37      1.59
δBWF/BWF threshold   21.5%     20%       23.6%

Table 3: (b)

Δ for control experiment C
                                BWF
SF                   0.1       0.2       0.4       average
0                    2.42      3.03      2.61      2.69
  δA_max threshold   2.79      3.19      3.11
0.4                  2.53      3.08      2.83      2.81
  δA_max threshold   2.91      3.53      3.63
average              2.47      3.05      2.72

Table 3: (c)

Δ for phase ripple experiment
                          ripple frequency (cycle/octave)
               0.25      0.5       1         2         4         8
Δ              2.12      2.38      1.97      2.12      4.23      11.11
pdl, 15 dB     8.09°     9.08°     7.53°     8.1°      16.22°    43.17°
Δ              2.35      2.32      2.23      3.16      6.32
pdl, 25 dB     5.39°     5.31°     5.11°     7.31°     14.53°

Table 3: (d)

In summary, the picture that emerges from applying the maximum difference model to our data is a mixed one. For instance, the model clearly accounts for several of the observed trends in our data, especially the δBWF tests. Using variable σ’s may extend the applicability of the model to a bigger portion of the tests, but it clearly destroys the good predictions of the δBWF tests. The pattern of prediction errors is somewhat more structured than for the channel model, in that the model seems to fail mostly for the narrowest peaks. It also fails to predict the independence of δSF from SF.

C.  Ripple  analysis  model 

Both the channel and the maximum difference models described above account partially for the data, and it is possible that one or both of the models can be made to account fully for the data with enough parameter adjustments. It is therefore difficult to pass judgement on these models on these grounds alone.

Both models, however, have been reported to raise fundamental questions when
applied to a different stimulus, the sinusoidal ripple [Bernstein and Green, 1987;
Green, 1986]. The maximum difference model predicts well the detectable amplitude
of the ripple [Bernstein and Green, 1987]. The model also correctly predicts that
thresholds are independent of the number of ripple cycles, since they are estimated
from a single pair of channels. This, however, runs directly counter to the premise
of the channel model, namely that more independent channels of information must lead to
lower thresholds [Green, 1986]. The success of the maximum difference model with
rippled (and alternating) profiles therefore raises the question: Why is the additional
information provided by other independent channels not used?

One  way  to  resolve  this  difficulty  is  to  change  the  definition  of  the  independent 
channels.  For  instance,  if  one  thinks  of  the  maximum  level  difference  (from  a  pair  of 
channels)  as  the  amplitude  of  the  ripple,  then  adding  more  ripple  cycles  (and  hence 
more  channels)  does  not  add  new  information.  Thus,  an  alternative  definition  (or 
model)  of  the  channels  is  that  they  sense  the  amplitude  (and  perhaps  the  phase)  of 
ripples  of  different  frequencies.  Such  a  channel  does  not  measure  the  energy  difference 
at  one  point  in  the  spectrum,  rather  it  conveys  information  about  a  more  structured 
spectral  pattern  (e.g.,  the  ripple). 
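
The argument above can be illustrated with a toy computation. The sketch below is not either model as fitted in the paper: the function names, the 1 dB ripple amplitude, and the 8 components per octave are assumptions made only for illustration, and the root-sum-square statistic is merely a stand-in for pooling information across independent channels. It compares three quantities as ripple cycles are added: the maximum level difference between any pair of channels, the amplitude of the ripple component itself, and the pooled statistic.

```python
import numpy as np

def ripple_levels(n_cycles, amplitude_db=1.0, ripple_freq=1.0, per_octave=8):
    """Component levels (dB re a flat standard) for a ripple spanning n_cycles cycles."""
    n_components = int(round(n_cycles * per_octave / ripple_freq))
    omega = np.arange(n_components) / per_octave  # component position in octaves
    return omega, amplitude_db * np.sin(2 * np.pi * ripple_freq * omega)

def max_difference(levels):
    """Largest level difference between any pair of channels."""
    return levels.max() - levels.min()

def pooled_statistic(levels):
    """Root-sum-square over channels (hypothetical stand-in for channel pooling)."""
    return np.sqrt(np.sum(levels ** 2))

def ripple_amplitude_phase(omega, levels, ripple_freq):
    """Amplitude and phase of the ripple component at ripple_freq (cycles/octave)."""
    coeff = 2.0 * np.mean(levels * np.exp(-2j * np.pi * ripple_freq * omega))
    return np.abs(coeff), np.angle(coeff)

for n_cycles in (1, 2, 4):
    omega, levels = ripple_levels(n_cycles)
    amp, phase = ripple_amplitude_phase(omega, levels, 1.0)
    print(n_cycles, max_difference(levels), amp, pooled_statistic(levels))
```

Under these assumptions the maximum difference and the ripple amplitude stay fixed as cycles are added, while the pooled statistic keeps growing, which is the discrepancy the text describes.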

This “ripple analysis model” implies that the detection strategy of the spectral
profile is not applied to the profile directly, but instead to some transformation of
the profile. Such an approach is not unusual. After all, the spectral profile itself is a
transformation from the time domain of the signal, and the classical channel model is
applied to a (Fourier) transformation of the signal (the spectrum). An elaboration of
this idea is presented in Part II of this paper.

Appendix I: Detection thresholds measured in rms units

We have collected in this Appendix all rms-thresholds obtained in the peak SF and
BWF change detection tasks. The rms-threshold is derived from the SF and BWF changes
on the assumption that all changes in the spectral components (which are largest
around the peak) contribute equally to the detection process. The rms-threshold is
defined as $20 \log_{10} \sqrt{\sum_{i=1}^{n} (\Delta a_i / a_i)^2}$, where $\Delta a_i$ is the
change in the amplitude of the ith component at threshold, $a_i$ is the amplitude of the
ith component in the standard, and n is the number of components.

This measure is closely related to that used in most profile analysis experiments
previously reported. Specifically, for the case of a flat standard of amplitude a, the
measure usually used is $20 \log_{10}(\mathrm{rms}(\Delta a_i)/a)$. This measure converts to our
units if we add a constant $10 \log_{10} n$, which accounts for the number of signal
components n. Thus, the threshold for the single increment detection task (n = 1) is the
same in both units. For other commonly studied detection tasks in profile analysis (a
21-component step function at 1 kHz, an alternating-amplitude spectrum, and ripple
signals), the reported threshold values are -23.33 dB, -23.07 dB, and -21.58 dB,
respectively (see [Green, 1986; Richards, Onsan and Green, 1989]). Computed in our
units, the thresholds are -10.11 dB, -10.06 dB, and -8.36 dB, respectively.
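
The relation between the two measures can be checked numerically. The following is a minimal sketch, assuming base-10 logarithms throughout; the function names are illustrative and do not come from the paper.

```python
import math

def rms_threshold_db(delta_a, a):
    """Appendix I measure: 20*log10( sqrt( sum_i (delta_a_i / a_i)^2 ) )."""
    total = sum((d / s) ** 2 for d, s in zip(delta_a, a))
    return 20.0 * math.log10(math.sqrt(total))

def flat_standard_measure_db(delta_a, a_flat):
    """Conventional measure for a flat standard: 20*log10( rms(delta_a) / a )."""
    rms = math.sqrt(sum(d * d for d in delta_a) / len(delta_a))
    return 20.0 * math.log10(rms / a_flat)

def to_appendix_units(threshold_db, n):
    """Convert the conventional measure to the Appendix I unit by adding 10*log10(n)."""
    return threshold_db + 10.0 * math.log10(n)

# Reported thresholds for the 21-component step and ripple tasks, converted:
print(round(to_appendix_units(-23.33, 21), 2))  # -10.11
print(round(to_appendix_units(-21.58, 21), 2))  # -8.36
```

For any set of increments on a flat standard, `rms_threshold_db` and `to_appendix_units(flat_standard_measure_db(...), n)` agree, which is the conversion stated in the text.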

In  order  to  facilitate  the  comparison  with  corresponding  figures  in  the  paper  using 
different  threshold  measures,  we  shall  use  the  same  figure  numbers  as  before  except  for 
an  additional  prefix  (A). 

1.  Detection  of  changes  in  spectral  peak  symmetry 

Stimuli  and  testing  conditions  are  described  in  Sec.  II. 

(i)  Dependence  on  symmetry  and  bandwidth  factors  of  the  standard 

Threshold trends are as described earlier in Sec. II, in that detection of a change
in peak symmetry is independent of the peak shape of the standard (Fig. A3). The
average detection threshold was approximately -8.5 dB. This value is comparable to that
measured in other profile analysis experiments.

(ii)  Dependence  on  peak  amplitudes 

Data are averaged and presented as in Sec. II. The same trends established earlier
for the 15 dB case hold also for the other two levels. However, unlike the δSF
thresholds (Fig. 4), the average rms-threshold increases monotonically with peak level
(Fig. A4). The increase is small, on the order of 0.25 dB per 1 dB change in
peak level.

(iii) Dependence on spectral density

The rms-threshold increases with increasing spectral density, from the 11- to the
41-component tests (Fig. A5); that is, the 11-component thresholds are lower than those
for the 41-component signal. Some of this difference is probably due to masking effects
among the 41 closely spaced components [Bernstein and Green, 1987]. Another possible
source is the large frequency spacing among the 11 components, which may cause the task
to be perceived as amplitude changes in several smaller peaks rather than the detection
of symmetry in a single (broader) peak.

Figure A3: Symmetry change detection rms-thresholds for the 41-component complex
and 15 dB peak amplitude, averaged over five subjects and four BWF's in (a), and
five SF's in (b). The rms-threshold is defined as
$20 \log_{10} \sqrt{\sum_{i=1}^{n} (\Delta a_i / a_i)^2}$, where $\Delta a_i$
is the change in the amplitude of the ith component at threshold, $a_i$ is the amplitude
of the ith component in the standard, and n = 41. The rms-threshold is independent of
SF and BWF. The error bars are the standard deviations of the means.

Figure A4: Symmetry change detection rms-thresholds for the 41-component complex
and 3 peak amplitudes: 10 dB, 15 dB, and 20 dB, relative to baseline. The data are
averages of three subjects and three BWF's in (a), and three SF's in (b). The values
along the ordinates are defined as in Fig. A3. Points are slightly offset for clarity.

2. Detection of changes in spectral peak bandwidth factor

Stimuli and paradigm are described in Sec. III of the text.

(i)  Dependence  on  symmetry  and  bandwidth  factors  of  the  standard 

The data are averaged and presented as described in Sec. III. Detection thresholds
are independent of SF for all BWF's. However, they increase monotonically with the
standard's BWF. This trend is more clearly depicted in Fig. A6(b), where the
rms-thresholds are averaged over the five SF's and then plotted against BWF. The
functional form of this dependence, which best approximates the experimental data points
in the least-squares sense, is:

$$\text{threshold (dB)} \approx -6.85\ \text{(dB)} + 3.3\ \text{(dB/octave)} \cdot \log_2(10\,\mathrm{BWF})\ \text{(octave)}.$$
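
As a quick worked check of this fit, the short sketch below evaluates it at the three BWF's used in the 41-component test (0.1, 0.2, and 0.4). The function name and default arguments are illustrative; only the intercept and slope come from the text.

```python
import math

def fitted_rms_threshold_db(bwf, intercept_db=-6.85, slope_db_per_octave=3.3):
    """Least-squares fit quoted in the text: -6.85 + 3.3 * log2(10 * BWF), in dB."""
    return intercept_db + slope_db_per_octave * math.log2(10.0 * bwf)

for bwf in (0.1, 0.2, 0.4):
    print(f"BWF = {bwf}: {fitted_rms_threshold_db(bwf):.2f} dB")
# BWF = 0.1: -6.85 dB
# BWF = 0.2: -3.55 dB
# BWF = 0.4: -0.25 dB
```

Each doubling of BWF therefore raises the fitted rms-threshold by 3.3 dB.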

(ii)  Dependence  on  peak  amplitudes 

Data are averaged and presented as in Sec. III. Mean rms-thresholds tend to increase
with peak level in a manner similar to that seen earlier in the SF change detection
task.

(iii) Dependence on spectral density

The rms-thresholds increase monotonically with BWF, and with spectral density
(Fig. A8(a)).

Figure A5: Symmetry change detection thresholds for 41-, 21-, and 11-component
complexes, averaged over four subjects and three BWF's (a) and three SF's (b). The
rms-threshold increases with increasing spectral density, from the 11- to the
41-component tests.

Figure A6: (a) Bandwidth change detection rms-thresholds for 41 frequency components,
15 dB peak level, and three BWF's: 0.1, 0.2, and 0.4, averaged over three listeners.
Thresholds increase monotonically with BWF, and the form of this dependence, averaged
over five SF's, is depicted in (b). The dotted line in (b) is the least-squares
linear approximation of this dependence: threshold (dB) = -6.85 + 3.3 log2(10 BWF).
Data are slightly offset for clarity.

Figure A7: Bandwidth change detection rms-thresholds for the 41-component complex
and 3 peak amplitudes: 10 dB, 15 dB, and 20 dB. The thresholds are averages of three
subjects and three BWF's in (a), and three SF's in (b). Points are offset for clarity.

Figure A8: Bandwidth change detection thresholds for 41-, 21-, and 11-component
complexes, averaged over three subjects and three BWF's (a) and two SF's (b). The
rms-thresholds are in general higher for the 41- than for the 21- and 11-component cases.

Appendix II: Brief review of the Ewaif model

An analytic function $m(t)$ with envelope $|m(t)|$ and phase $\phi(t)$ is related to a real
waveform $s(t)$ with Hilbert transform $\hat{s}(t)$ [Oppenheim and Schafer, 1990; Papoulis,
1962; Voelcker, 1966] as:

$$m(t) = |m(t)|\, e^{j\phi(t)} = s(t) + j\hat{s}(t),$$

where

$$|m(t)| = \sqrt{s^2(t) + \hat{s}^2(t)},$$

and

$$\phi(t) = \arctan\frac{\hat{s}(t)}{s(t)}.$$

The equivalent pitch of a complex sound is defined by the Ewaif model as:

$$\mathrm{Ewaif} = \frac{\int_0^T |m(t)|\, \mathrm{instf}(t)\, dt}{\int_0^T |m(t)|\, dt},$$

where T is the stimulus duration, and instf(t) is the instantaneous frequency of s(t),
defined as $\mathrm{instf}(t) = \frac{1}{2\pi} \frac{d\phi(t)}{dt}$.

For our stimulus of n components, with the kth component's amplitude, frequency, and
phase denoted as $a_k$, $f_k$, and $\varphi_k$, respectively, the above definition translates to:

$$\mathrm{Ewaif} = \frac{\int_0^T \left( \sum_{k=1}^{n} a_k^2 f_k + \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} a_k a_j (f_k + f_j) \cos\big(2\pi (f_k - f_j)t + \varphi_k - \varphi_j\big) \right) \frac{dt}{e(t)}}{\int_0^T e(t)\, dt},$$

where

$$e(t)^2 = \sum_{k=1}^{n} a_k^2 + 2 \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} a_k a_j \cos\big(2\pi (f_k - f_j)t + \varphi_k - \varphi_j\big).$$
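
A minimal numerical sketch of the component-wise expression above is given below. It assumes uniform time sampling and a plain Riemann sum for both integrals; the function name `ewaif` and its parameters are illustrative and not taken from the paper. Samples at which the envelope vanishes are given zero weight, which is an implementation choice made here to avoid division by zero.

```python
import numpy as np

def ewaif(amps, freqs, phases, duration, n_samples=20000):
    """Envelope-weighted average of instantaneous frequency for a sum of sinusoids."""
    a = np.asarray(amps, dtype=float)
    f = np.asarray(freqs, dtype=float)
    p = np.asarray(phases, dtype=float)
    t = np.arange(n_samples) * (duration / n_samples)

    # Numerator integrand before division by e(t), and squared envelope e(t)^2.
    num = np.full_like(t, np.sum(a ** 2 * f))
    env2 = np.full_like(t, np.sum(a ** 2))
    n = len(a)
    for k in range(n - 1):
        for j in range(k + 1, n):
            c = np.cos(2 * np.pi * (f[k] - f[j]) * t + p[k] - p[j])
            num += a[k] * a[j] * (f[k] + f[j]) * c
            env2 += 2 * a[k] * a[j] * c

    env = np.sqrt(np.maximum(env2, 0.0))  # clip tiny negative rounding errors
    weighted = np.divide(num, env, out=np.zeros_like(num), where=env > 1e-12)
    # The sample spacing cancels between numerator and denominator.
    return weighted.sum() / env.sum()

print(ewaif([1.0], [1000.0], [0.0], 0.1))                 # single component
print(ewaif([1.0, 1.0], [900.0, 1100.0], [0.0, 0.0], 0.1))  # equal amplitudes
print(ewaif([1.0, 0.5], [900.0, 1100.0], [0.0, 0.0], 0.1))  # unequal amplitudes
```

A single component returns its own frequency, two equal-amplitude components return the mean of their frequencies, and with unequal amplitudes the value moves toward the frequency of the stronger component.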
Appendix III: Adding a constant phase to the Fourier transform of the profile

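Consider the profile $p(\omega)$ whose Fourier transform is $P(\Omega)$:

$$p(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} P(\Omega)\, e^{j2\pi\Omega\omega}\, d(2\pi\Omega).$$

Adding a constant phase angle $\theta_0$ to all the transform components changes the
profile to:

$$p_{\theta_0}(\omega) = \int_{-\infty}^{0} P(\Omega)\, e^{j\theta_0} e^{j2\pi\Omega\omega}\, d\Omega + \int_{0}^{\infty} P(\Omega)\, e^{-j\theta_0} e^{j2\pi\Omega\omega}\, d\Omega,$$

where the integral is split to emphasize that the phase function (added to negative
frequencies and subtracted from positive frequencies) must be odd as a function of $\Omega$
in order for $p_{\theta_0}$ to remain real. This expression can be simplified further by
substituting $e^{\pm j\theta_0} = \cos(\theta_0) \pm j \sin(\theta_0)$, and collecting terms:

$$p_{\theta_0}(\omega) = \cos(\theta_0) \int_{-\infty}^{\infty} P(\Omega)\, e^{j2\pi\Omega\omega}\, d\Omega - \sin(\theta_0) \int_{-\infty}^{\infty} jP(\Omega) \cdot \mathrm{sign}(\Omega) \cdot e^{j2\pi\Omega\omega}\, d\Omega.$$

Therefore,

$$p_{\theta_0}(\omega) = \cos(\theta_0)\, p(\omega) + \sin(\theta_0)\, \mathcal{H}(p(\omega)),$$

where $\mathcal{H}(p(\omega))$ is the so-called Hilbert transform of $p(\omega)$. This is the
expression used in computing the profiles in Fig. 11. A simpler expression can be used
for the case of small $\theta_0$ ($\cos(\theta_0) \approx 1$ and $\sin(\theta_0) \approx \theta_0$):

$$p_{\theta_0}(\omega) \approx p(\omega) + \theta_0\, \mathcal{H}(p(\omega)).$$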
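
The identity above can be verified numerically with a discrete Fourier transform. The sketch below assumes a zero-mean profile sampled uniformly and periodically along the log-frequency axis (the DC term is untouched by the phase shift but removed by the Hilbert transform, so the two sides agree only for zero-mean profiles). The function names and the example profile are illustrative.

```python
import numpy as np

def hilbert_transform(p):
    """Hilbert transform via the FFT: multiply the spectrum by -j * sign(Omega)."""
    spectrum = np.fft.fft(p)
    sgn = np.sign(np.fft.fftfreq(len(p)))
    return np.real(np.fft.ifft(-1j * sgn * spectrum))

def add_constant_phase(p, theta0):
    """Subtract theta0 from positive and add it to negative ripple frequencies."""
    spectrum = np.fft.fft(p)
    sgn = np.sign(np.fft.fftfreq(len(p)))
    return np.real(np.fft.ifft(spectrum * np.exp(-1j * theta0 * sgn)))

def phase_shift_closed_form(p, theta0):
    """cos(theta0) * p + sin(theta0) * H(p), the closed form derived in the text."""
    return np.cos(theta0) * p + np.sin(theta0) * hilbert_transform(p)

# Example: a zero-mean profile over 4 octaves with ripples at 1 and 2 cycles/octave.
omega = np.arange(256) / 256 * 4.0
profile = np.cos(2 * np.pi * omega) + 0.5 * np.sin(2 * np.pi * 2 * omega)
theta0 = 0.3
direct = add_constant_phase(profile, theta0)
closed = phase_shift_closed_form(profile, theta0)
print(np.max(np.abs(direct - closed)))  # difference at rounding-error level
```

For a single ripple cos(2πΩω), both routes give cos(2πΩω − θ0), i.e. the ripple is shifted in phase without any change in its amplitude.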